ABSTRACT

The use of digital computers to process various types of
sensor data is becoming increasingly common in both
civilian and military applications. One example of this use
is the enhancement of photographs to increase their clarity
or to emphasize a particular detail.

Previously, this processing was performed by specialized
circuits, mainframes, or minicomputers. More recently,
extremely powerful microprocessors have become available
that show potential for application in this area.

This thesis explores a particular class of image
processing, known as Image Segmentation, implemented on a
particular microprocessor. The microprocessor is the
Fairchild F9450, the first civilian version of the 1750A
military specification microprocessor.

This microprocessor, along with its associated chip set,
appears well suited to image processing, having high speed
capability, direct floating point arithmetic instructions,
multiprocessing capacity, and the ability to address up to
sixteen megabytes of memory.

Additionally, a sophisticated software development tool
set, known as Microprocessor Pascal, is available to develop
and test software for the 1750A/F9450 microprocessor. This
tool set allows software to be developed on the VAX-11/780
minicomputer, targeted for final use on the 1750A/F9450.

This work utilized the Microprocessor Pascal tool set to
test and compare representative Image Segmentation
algorithms. The speeds of execution and code sizes of the
programs were determined for the F9450/1750A microprocessor
and the VAX-11/780 minicomputer, and were compared to
determine the feasibility of using the F9450/1750A
microprocessor for image segmentation work.

Several images resulting from the image segmentation
processing are included, as well as the Pascal programs used
to perform the processing.




TABLE OF CONTENTS


I.   INTRODUCTION
     A. GENERAL
     B. PURPOSE OF WORK

II.  1750A ARCHITECTURE AND SOFTWARE TOOL SET
     A. HARDWARE REQUIREMENTS
     B. 1750A/F9450 MICROPROCESSOR
     C. MICROPROCESSOR PASCAL TOOL SET

III. METHOD OF EXECUTION SPEED ESTIMATION
     A. GENERAL
     B. METHOD OF SPEED ESTIMATION

IV.  IMAGE SEGMENTATION ALGORITHMS
     A. GENERAL
     B. PROGRAM 1 OPERATION
     C. PROGRAM 2 OPERATION

V.   ANALYSIS OF THE TEST RESULTS
     A. GENERAL
     B. COMPARISON OF PROGRAMS 1 AND 2
     C. ANALYSIS OF RESULTS
     D. COMPARISON OF VAX AND 1750A/F9450 SYSTEMS
     E. SUMMARY

VI.  CONCLUSIONS AND RECOMMENDATIONS
     A. PROBLEMS ENCOUNTERED WITH TOOL SET
     B. METHODS OF IMPROVING SPEED
     C. SUGGESTIONS FOR FURTHER WORK

APPENDIX A: IMAGE SEGMENTATION PROGRAM 1 LISTING

APPENDIX B: IMAGE SEGMENTATION PROGRAM 2 LISTING

LIST OF REFERENCES

INITIAL DISTRIBUTION LIST


LIST OF FIGURES


 1. SHIP IMAGES BEFORE AND AFTER SEGMENTATION
 2. PROGRAMMER'S REGISTER MODEL OF 1750A/F9450
 3. F9450/1750A MICROPROCESSOR ARCHITECTURE
 4. SOFTWARE DEVELOPMENT TOOL SET
 5. SUMMARY OF PROGRAM TEST RESULTS
 6. ASSEMBLY LANGUAGE PROGRAM SAMPLE
 7. SAMPLE TIMING CALCULATION, BASED ON FIGURE 6
 8. CALCULATION OF PIXEL EDGE MAGNITUDE
 9. INPUT ARRAY TARGET AND BACKGROUND WINDOWS
10. RESULTS OF VARYING COST FACTORS IN BAYESIAN
    PROBABILITY (PROGRAM 2)
11. SUMMARY OF PROGRAM TEST RESULTS
12. INDEX MODE/BASE RELATIVE ADDRESSING
13. COMPARATIVE ASSEMBLY CODE TRANSLATIONS
14. PARALLEL PROCESSING SCHEME FOR MULTIPROCESSING
15. SERIES PROCESSING SCHEME FOR MULTIPROCESSING
    (PIPELINING)


ACKNOWLEDGEMENTS 


I wish to gratefully acknowledge my thesis advisor,
Professor Chin-Hwa Lee, who provided invaluable assistance
in completion of this thesis.

I would also like to express my gratitude to Professor
Alex Gerba Jr. for his assistance.


I. INTRODUCTION


A. GENERAL 

The application of image processing is expanding into
many new areas, including the military. In many cases the
need exists to enhance a desired image, often in the
presence of background clutter, to allow target
identification and similar tasks. This is often done by an
automated system. One type of such processing is the method
of Image Segmentation.

Image Segmentation involves the conversion of an image
with multiple levels of gray values (which can represent
color, brightness, or infrared radiation, as examples) into
a "binary" image, which has only two levels. This has the
effect of converting a "half tone" image into a "black and
white" one. Figure 1, as an example, shows the input and
output images from an Image Segmentation system. The input
is a ship image composed of pixels which vary over a range
from zero to two hundred and fifty five. The output image is
the same ship, where the image pixels have only two values:
zero and one. This process has the additional effect of
removing a great deal of the background clutter.

Like most computer graphics applications, image
segmentation is a very "CPU intensive" process, requiring a
large amount of computation. Performing this type of
processing in real time will require very high speed in both
hardware and software. For this reason such processing is
often done with specialized, custom designed hardware. In
this study, the possibility of efficiently performing image
segmentation with a standard, general purpose microprocessor
is explored.


B. PURPOSE OF WORK 

The purpose of this thesis is to determine and compare
the speed of image segmentation in two different computer
architectures: the 1750A/F9450 microprocessor, and the
VAX-11/780 minicomputer using the VMS operating system.
Comparisons will be made in terms of actual speeds of
execution, sizes of generated code, and overall efficiency.
From these factors, it should be possible to determine which
method is more appropriate for a given application. While
the images used for this work are infrared images of ships,
the techniques used are applicable to a wide range of
applications and sensor types, including areas such as
geological surveys by aerial photography, medical imaging,
and so forth.

It should also be noted that, while a particular
microprocessor and software development system are used
for this work, the selection is not unique, and other
combinations of tools could be equally applicable.

This work will present a description of the hardware and
software used, give a brief discussion of two representative
image segmentation algorithms, and present the results of
the comparison between the two algorithms.

It was necessary to determine the 1750A/F9450 CPU
operating speeds indirectly (for reasons to be discussed).
The method which was derived for doing this will also be
explained and demonstrated.

Finally, the results derived will be analyzed, and a
rationale to explain them will be discussed, relating to the
actual merits of the 1750A/F9450 microprocessor versus the
VAX-11/780 for image segmentation processing.





Before Processing: 





After Processing: 


Figure 1. Ship Images Before and After Segmentation 
It is found that, in general, the 1750A/F9450
microprocessors are capable of performing image segmentation
efficiently, but not normally fast enough for real time
operations unless certain special methods are used.
Possible methods of increasing the speed of operation are
presented.


II. 1750A ARCHITECTURE AND SOFTWARE TOOL SET


A. HARDWARE REQUIREMENTS 

As stated earlier, virtually any type of image processing 
makes great demands on the hardware and software used. Image 
segmentation is no exception. 

The first hardware requirement is the need to store and
process relatively large amounts of data. This results from
the fact that images require at least two dimensional data
arrays, and each pixel requires enough bits to represent the
desired number of intensity levels in the image. In some
cases the large amount of memory required may be reduced
somewhat by means of efficient algorithms which require only
a small portion of an image to be processed at a time
(through such means as "overlap and save" methods of
convolution). In general, however, the trend is towards
"real time" systems with large capacity, such as operator
displays in aircraft and medical imaging systems.

The second hardware requirement is for the processor to
operate at sufficiently high speeds to meet the design
needs. If the processor is to analyze only off-line data,
the speed requirement is not as great. Many military and
industrial systems, however, require the processing
work to be done in real time. This creates the need for a
microprocessor to operate at higher speeds than those
previously available.


B. 1750A/F9450 MICROPROCESSOR 

The microprocessor and software development tool set 
selected for this work are capable of supporting the memory 
and speed requirements just stated. 

The microprocessor selected for use is the Fairchild
F9450 16-bit microprocessor, which is a civilian version of
the military 1750A microprocessor. The programmer's register
diagram of the 1750A/F9450 is shown in Figure 2 [Ref. 1].
The block diagram of the actual chip architecture is
illustrated in Figure 3 [Ref. 2].

As illustrated in Figure 3, the 1750A microprocessor
architecture has five sections: data processor,
microprogrammed control, address processor, interrupt and
fault processor, and timing unit. The data processor allows
use of a variety of data types and direct floating point
operation instructions. The address processor uses an
independent incrementer for the Instruction Counter, and
also allows a wide range of addressing modes for the
microprocessor. The interrupt processor and timing units are
especially useful for multiprocessor operations, as will be
discussed later.

The architecture is similar in overall conception to the
VAX-11/780, but lacks some features. One example is the lack
of a separate numeric coprocessor, similar to the VAX's
Floating Point Accelerator option. Additionally, the
1750A/F9450 lacks any built-in facilities for direct
implementation of "virtual memory."

The 1750A/F9450 instruction set has a number of
instructions to make use of its powerful architecture. Among
these are instructions to control the two on-chip timers,
and a Built-In Function to allow the direct use of user
defined instructions.





[Register diagram: Pending Interrupt, Fault Register,
Instruction Count, Status Word, Sys. Config. Reg., Timer A,
Timer B]

Figure 2. Programmer's Register Model of 1750A/F9450


The 1750A/F9450 CPU is a highly sophisticated
microprocessor, as can be seen in Figures 2 and 3. It
includes sixteen 16 bit general purpose registers, a 16 bit
Status Word register, and a System Configuration Register in
its internal architecture. The general purpose and Status
Word registers are very similar in concept to the VAX-11/780
architecture, which uses sixteen 32 bit general purpose
registers and a 32 bit Processor Status Longword register.
(Of course, 32 bits allow a greater range of instructions
and greater data accuracy. The architecture itself, however,
is quite similar.) The 1750A/F9450 System Configuration
Register contains information relating to the chip's
external environment, such as the presence or absence of an
additional microprocessor, memory protection unit, or block
protection unit, and the interrupt mode in use. The VAX
system does not use a configuration register, and it is
normally installed in a more standardized configuration.

The 1750A/F9450 CPU is capable of operating at clock
speeds of up to twenty megahertz. This microprocessor is one
component of a chip set which also includes a Memory
Management Unit (the F9451) and a Block Protect Unit (the
F9452). Alone, the microprocessor is capable of addressing
up to two million 16 bit words of random access memory, and
up to twenty million words with the Memory Management Unit.
The 1750A/F9450 is highly optimized for real time operation.
The features that achieve this capability include a
sophisticated 16 vector interrupt handling system, built-in
multiprocessor capabilities, and two programmable timers on
the chip. This microprocessor also features 32 and 48 bit
floating point arithmetic, built-in self-test upon power up
or reset, and fault handling capabilities. This architecture
is highly advanced for a microprocessor, but it is not
comparable in overall capability to a powerful minicomputer
system such as the VAX, the architecture used here for
comparison, since the VAX is designed for multi-user,
timesharing operation.

One of the significant differences between the two
systems is the size of the assembly language instruction
sets. The 1750A/F9450 has 141 instructions in its set, while
the VAX has over 240. This greater flexibility should enable
a VAX compiler to convert a high level language statement
into fewer assembly language statements than the 1750A/F9450
compiler would require. Another advantage of the VAX system
is a richer range of addressing modes. This will be
discussed later in this thesis.


C. MICROPROCESSOR PASCAL TOOL SET 

The software development system selected for use is
called Micro Processor Pascal (MPP), and was developed by
Texas Instruments for use with the 1750A/F9450
microprocessors. It is a complete tool set for software
development, allowing software for the 1750A/F9450 to be
developed on a VAX-11/780 minicomputer, targeted for final
use on the 1750A/F9450 microprocessor. The tool set
utilizes a superset of standard ANSI Pascal, and adds
facilities to use the 1750A's multiprocessor/multitasking
capabilities. The tool set includes a compiler, an
assembler, a binder and linker, a reverse assembler (to
generate assembly code from the compiler output), and a
debugger-simulator. The components and operation of the tool
set are shown in Figure 4 [Ref. 3]. The Reverse Assembler,
which is crucial to the work done here, is particularly
useful for allowing hand optimization of a program. This
manual tuning of code would allow increased speed of
operation for time critical programs, as a skilled
programmer normally writes more efficient code than a
compiler.

Another optimization feature of the tool set is the
ability of the compiler to partially optimize the object
code itself during the compilation. This is dependent upon
the programmer using certain programming conventions as
described in the MPP/1750A User's Manual. For example, it is
found to be faster and more efficient to pass parameters to
a procedure by reference than by value. Also, the
IF-THEN-ELSE statement is faster than a corresponding CASE
statement, if the possible paths can be handled by an IF
statement. Even the ordering of variables and data types in
the declaration portion of the program is found to affect
the execution speed. Further details can be found in the
User's Manual.

This tool set was used to write, debug, and test Pascal
versions of the two image segmentation algorithms studied
here. In addition to determining execution speed estimates
using the microprocessor tool set, the same algorithms
were also compiled and run under VAX Pascal. This was to
allow comparison of the relative speed of execution, and
of the compiler-generated code size, in the two different
environments.


Figure 4. Software Development Tool Set


III. METHOD OF EXECUTION SPEED ESTIMATION


A. GENERAL 

One of the main purposes of this work was to determine
the speed with which the 1750A/F9450 microprocessor could
process the image arrays in representative segmentation
algorithms (to be described later). Because no actual
microprocessor system was available on which to run the
programs, a method of estimating processor execution speed
indirectly had to be found.

As discussed earlier, the tool set Reverse Assembler
allows the generation of assembly language programs from the
compiled Pascal source code. From this reverse assembled
code, it was possible to calculate the total number of
executions of a particular instruction (a "JMP" or "CALL",
for instance). The Preliminary Data Sheet of the 1750A/F9450
processor contains timing data specifying the amount of time
that a given instruction takes to execute. Combining these
pieces of information, it is possible to estimate execution
times of the assembly language program. The assembled code
size (in number of lines of assembly code) was readily
obtained by studying the code listings produced by the VAX
and Microprocessor Pascal compilers respectively. The speed
estimate is, of course, not as accurate as actual operational
tests on a 1750A/F9450 microprocessor, but it should be a
reasonable representation of the processor's performance.

The final results are summarized in Figure 5.


                        Program 1          Program 2

Pascal Source Code      462 lines          382 lines

VAX Assembly Code       548 lines          597 lines

MPP Assembly Code       911 lines          1367 lines

VAX Execution Time      8.31-8.41 sec      14.37-14.9 sec

MPP Execution Time      8.78-8.91 sec      14.24-14.8 sec


Figure 5. Summary of Program Test Results


The two programs, and the meaning of each item in the
table, will be discussed in detail in Chapter V, but it can
be seen at a glance that the 1750A/F9450 microprocessor
should be at least comparable overall in speed to the
powerful VAX-11/780 minicomputer. This speaks eloquently of
the power of this microprocessor.


B. METHOD OF SPEED ESTIMATION

As to the actual method of execution speed estimation, a
brief but representative example is now presented. After a
Pascal program has been compiled by the microprocessor tool
set, the Reverse Assembler is used to generate an assembly
language version of the same program. A small sample of an
assembly language program is shown in Figure 6, and will be
discussed in this example.


L000A   EQU    $
        LIM    R2, 00F5A
        CB     R14, 00003
        BLT    L0027
        LIM    R12, 0A4053E, R13
        LB     R14, 00002
        MSIM   R2, 005FB
        AR     R12, R2
        A      R12, 00003, R13
        LR     R4, R12
        STC    0, 00000, R4
        LIM    R12, 03B3E, R13
        LB     R14, 00002
        MSIM   R2, 005FB
        AR     R12, R2
        A      R12, 00003, R14
        LR     R4, R12
        STC    0, 00000, R4
        INCM   1, 00003, R14
        BR     L000A            Loop Iteration Path (to L000A)


Figure 6. Assembly Language Program Sample


The code shown is a small portion of the assembly
language program from one of the two algorithms used. It is
a loop, along the path marked "Loop Iteration Path" in the
figure, and it is executed a total of 1530 times. From the
known number of executions of the loop, the number of each
type of instruction contained in the loop, and the timing
information from the Preliminary Data Sheet, it is possible
to perform the calculations shown in Figure 7.


Type of Instruction:

             Load/Store  Add/Subtract  Compare   Jump      Multiply/Divide

             LIM: 3      INCM: 1       CB: 1     BLT: 1    MSIM: 2
             LR:  2      A:    2                 BR:  1
             LB:  2      AR:   2
             STC: 2
             ------      ------        ------    ------    ------
Count           9           5             1         2         2
Time each    x .2 uS     x .2 uS       x .4 uS   x .5 uS   x 1.85 uS
Subtotal     1.8 uS   +  1.0 uS   +    0.4 uS +  1.0 uS +  3.7 uS

             = 7.9 uS/iteration of loop

1530 iterations of loop x 7.9 uS/iteration of loop

             = 3.094 seconds

(Note: the "EQU" takes no execution time.)


Figure 7. Sample Timing Calculation, Based on Figure 6


As can be seen, the calculation is a relatively simple
application of arithmetic, but if based upon accurate
timing data, the method should yield reasonably accurate
estimates.
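
The arithmetic of Figure 7 can be reproduced with a short
sketch. Python is used here purely for illustration (the
thesis programs are written in Microprocessor Pascal), and
the names CLASS_TIME_US, LOOP_COUNTS, loop_time_us and
total_time_s are hypothetical. The per-class times are simply
the values printed in Figure 7, not figures taken
independently from the data sheet.

```python
# Per-class execution times in microseconds, as printed in Figure 7.
CLASS_TIME_US = {
    "load_store": 0.2,
    "add_subtract": 0.2,
    "compare": 0.4,
    "jump": 0.5,
    "multiply_divide": 1.85,
}

# Instruction counts for one pass through the loop of Figure 6.
LOOP_COUNTS = {
    "load_store": 9,       # LIM x3, LR x2, LB x2, STC x2
    "add_subtract": 5,     # INCM x1, A x2, AR x2
    "compare": 1,          # CB
    "jump": 2,             # BLT, BR
    "multiply_divide": 2,  # MSIM x2
}


def loop_time_us(counts, class_time_us):
    """Sum count * time over every instruction class (microseconds)."""
    return sum(n * class_time_us[name] for name, n in counts.items())


def total_time_s(iterations, counts=LOOP_COUNTS, class_time_us=CLASS_TIME_US):
    """Estimated time in seconds for the given number of loop passes."""
    return iterations * loop_time_us(counts, class_time_us) * 1e-6


print(f"{loop_time_us(LOOP_COUNTS, CLASS_TIME_US):.1f} uS per iteration")
```

With the numbers as printed, 1530 passes at 7.9 uS come to
roughly 12.1 milliseconds, while the 3.094 second total
quoted in Figure 7 matches 391,680 passes (1530 x 256). The
iteration count is therefore left as a parameter rather than
fixed at either value.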

The calculations, as already noted, are not difficult,
especially for small sections of code as demonstrated.
However, for the actual programs, such as those used in this
thesis, where there are hundreds of lines of code, the work
becomes laborious and error prone, due to miscalculation
and other human errors. If this method were necessary for
extended use, it might be possible to automate the process,
to allow a computer to produce the timing estimates.
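
The automation suggested above might look roughly like the
following Python sketch, which reads a reverse assembled
listing, classifies each mnemonic, and sums the class times.
The mnemonic table covers only the instructions that appear
in Figure 6, the times are those printed in Figure 7, and all
names (MNEMONIC_CLASS, count_classes, estimate_loop_time_us)
are hypothetical. The operands in the sample are a best-effort
transcription and are not used by the count. A real tool
would need the full timing table from the data sheet, and
would have to account for addressing modes and for how often
each branch is taken.

```python
from collections import Counter

# Assumed classification of the mnemonics seen in Figure 6.
MNEMONIC_CLASS = {
    "LIM": "load_store", "LR": "load_store",
    "LB": "load_store", "STC": "load_store",
    "INCM": "add_subtract", "A": "add_subtract", "AR": "add_subtract",
    "CB": "compare",
    "BLT": "jump", "BR": "jump",
    "MSIM": "multiply_divide",
}
# Class times in microseconds, as printed in Figure 7.
CLASS_TIME_US = {
    "load_store": 0.2, "add_subtract": 0.2, "compare": 0.4,
    "jump": 0.5, "multiply_divide": 1.85,
}
DIRECTIVES = {"EQU"}  # assembler directives take no execution time

FIGURE_6_LOOP = """\
L000A EQU  $
      LIM  R2, 00F5A
      CB   R14, 00003
      BLT  L0027
      LIM  R12, 0A4053E, R13
      LB   R14, 00002
      MSIM R2, 005FB
      AR   R12, R2
      A    R12, 00003, R13
      LR   R4, R12
      STC  0, 00000, R4
      LIM  R12, 03B3E, R13
      LB   R14, 00002
      MSIM R2, 005FB
      AR   R12, R2
      A    R12, 00003, R14
      LR   R4, R12
      STC  0, 00000, R4
      INCM 1, 00003, R14
      BR   L000A
"""


def count_classes(listing):
    """Count instructions per timing class in an assembly listing."""
    counts = Counter()
    for line in listing.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The mnemonic is the first field, or the second if a label leads.
        known = [f for f in fields[:2]
                 if f in MNEMONIC_CLASS or f in DIRECTIVES]
        if not known:
            raise ValueError(f"unrecognized instruction: {line!r}")
        if known[0] in MNEMONIC_CLASS:
            counts[MNEMONIC_CLASS[known[0]]] += 1
    return counts


def estimate_loop_time_us(listing):
    """Time for one pass through the listing, in microseconds."""
    counts = count_classes(listing)
    return sum(n * CLASS_TIME_US[name] for name, n in counts.items())
```

Raising an error on an unknown mnemonic, rather than skipping
it, is a deliberate choice in this sketch: a silently ignored
instruction would make the estimate optimistic.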


Because the calculations here were done by hand, there is
a definite possibility of human error; however, the
calculations were rechecked, so any error should be
relatively small. Since the method is only an estimate of
the execution speed, it is expected that there will be some
errors inherent in the method.


IV. IMAGE SEGMENTATION ALGORITHMS


A. GENERAL 

As the purpose of this work is to study the effectiveness
of the 1750A/F9450 microprocessor in implementing Image
Segmentation, two representative methods of segmentation
were selected for testing. Both methods yield similar
outputs for similar input data, but use different algorithms
to process the input data. Both methods were written in
Microprocessor Pascal, and the Pascal listings of the
programs are included in Appendices A and B. The two methods
will hereafter be referred to as Programs 1 and 2.

In order to compare and contrast the actual algorithms
most accurately, the two programs share as many procedures
as possible. Among other procedures, the two programs share
identical input and output procedures.


B. PROGRAM 1 OPERATION 

Program 1 uses a relatively simple threshold scheme. The
input data array is read from a disk file into the program's
data array for processing. This image data array is 256 rows
by 64 columns in size, and each element of the array is a
byte (an integer between 0 and 255) representing the gray
level of a pixel in the input image.

The program is written on the assumption that the image
consists of a target positioned near the center of the
image, surrounded by background. The program initially
builds a histogram of the background intensity values by
processing the leftmost and rightmost 16 columns. This
histogram is an array of the number of pixels having a given
intensity versus that intensity.

Following histogram generation, a value representing the
average background intensity distribution is computed by
dividing the sum of all histogram intensity values by the
number of intensities having nonzero values in the
histogram. Finally, a limit value is generated by
multiplying the average background intensity distribution
value by an empirical threshold value which is pre-selected
by the user.

Once the limit value is computed, it is used to process
the input image array into a binary output array of the same
dimensions. Each image pixel's intensity is read, and the
number of pixels having the same intensity value is
determined by checking the histogram. If this number of
pixels is greater than or equal to the precalculated limit
value, the corresponding binary pixel is set to one. If the
number of pixels is less than this limit, the corresponding
binary pixel is set to zero. The entire binary image is
generated in this fashion, pixel by pixel.

This threshold technique tends to generate a significant
number of false target and false background pixels, which
will appear as random "noise" in the binary image. To
eliminate these false pixels, Program 1 uses a final
filtering procedure called "REMOVE". This procedure compares
each pixel of the binary array with those surrounding it.
If the center pixel has one value, while the surrounding
ones are all of the other value, it is assumed that the
center pixel is a false one, and its value is reset to the
opposite value.

The entire scheme is dependent on the assumption that the
image of the target is brighter overall than the background.
However, this assumption could be reversed by switching the
inequality in the conversion process.
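
The two stages of Program 1 described above, thresholding
against the background histogram and the REMOVE clean-up
pass, can be sketched as follows. This is an illustrative
Python reading of the prose, not the Pascal listing from the
appendix, and all function names are hypothetical. Three
points are assumptions: the "sum of all histogram intensity
values" is taken to mean the sum of the histogram counts, the
REMOVE pass is taken to examine the eight surrounding pixels,
and pixels on the image border are left unchanged.

```python
def background_histogram(image, border=16, levels=256):
    """Count intensities in the leftmost and rightmost `border` columns."""
    hist = [0] * levels
    for row in image:
        for value in list(row[:border]) + list(row[-border:]):
            hist[value] += 1
    return hist


def limit_value(hist, threshold):
    """Average count per occupied intensity, scaled by the user threshold."""
    occupied = sum(1 for count in hist if count > 0)
    if occupied == 0:
        return 0.0
    return threshold * sum(hist) / occupied


def threshold_image(image, threshold, border=16):
    """Set a pixel to 1 when its intensity's histogram count reaches the limit."""
    hist = background_histogram(image, border)
    limit = limit_value(hist, threshold)
    return [[1 if hist[value] >= limit else 0 for value in row]
            for row in image]


def remove_isolated(binary):
    """Flip any interior pixel whose eight neighbours all hold the other value."""
    rows, cols = len(binary), len(binary[0])
    out = [row[:] for row in binary]  # read from the input, write to a copy
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = binary[r][c]
            neighbours = [binary[r + dr][c + dc]
                          for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                          if (dr, dc) != (0, 0)]
            if all(n != center for n in neighbours):
                out[r][c] = 1 - center
    return out
```

The comparison direction follows the wording of the text, so
intensities that are common in the border columns map to one;
as noted above, the inequality can be switched to reverse the
polarity.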


C. PROGRAM 2 OPERATION 

The second program is similar in overall operation and
data flow, but uses a more sophisticated algorithm to
perform the processing. Whereas the first program uses only
a single pixel attribute (intensity) to determine whether a
pixel is target or background, the second program (listed
in Appendix B) uses two attributes: intensity, and a
computed quantity called "edge magnitude". The edge
magnitude is a value which indicates the likelihood that a
pixel is part of an edge or corner of pixels of similar
intensity. This is more probable if the pixel is a part of
the target, since the background will tend to be a more
unstructured pattern of intensities.

The formula used to compute the edge magnitude of an
individual pixel is shown in Figure 8 [Ref. 4]:


3 x 3 Pixel Array:

     I1  I2  I3
     I8  I0  I4
     I7  I6  I5

$$E_0 = |D_x| + |D_y|$$

$$D_x = (I_1 + 2I_8 + I_7) - (I_3 + 2I_4 + I_5)$$

$$D_y = (I_1 + 2I_2 + I_3) - (I_7 + 2I_6 + I_5)$$


Figure 8. Calculation of Pixel Edge Magnitude


As shown in Figure 8, each pixel in turn is viewed as
the center of a 3 x 3 array of pixels. The Dx and Dy values
are calculated from the surrounding pixel intensities, with
the equations shown in Figure 8. The desired edge magnitude,
E0, is the sum of the absolute magnitudes of Dx and Dy. This
computation must be performed for every pixel, and the
result is used in the subsequent processing. This involves
a great deal of calculation.
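
The edge magnitude computation can be sketched in Python as
follows. The sketch assumes that Dx is the weighted
difference between the left and right columns of the 3 x 3
neighbourhood and Dy the weighted difference between its top
and bottom rows, with the centre neighbour of each side
weighted by two; this reading of Figure 8 is an assumption.
The function names are hypothetical, and setting the edge
magnitude of border pixels to zero is a choice made here
because the text does not say how the border is handled.

```python
def edge_magnitude(image, r, c):
    """Edge magnitude |Dx| + |Dy| of the interior pixel at (r, c)."""
    i1, i2, i3 = image[r - 1][c - 1], image[r - 1][c], image[r - 1][c + 1]
    i8, i4 = image[r][c - 1], image[r][c + 1]
    i7, i6, i5 = image[r + 1][c - 1], image[r + 1][c], image[r + 1][c + 1]
    dx = (i1 + 2 * i8 + i7) - (i3 + 2 * i4 + i5)  # left column minus right
    dy = (i1 + 2 * i2 + i3) - (i7 + 2 * i6 + i5)  # top row minus bottom
    return abs(dx) + abs(dy)


def edge_map(image):
    """Edge magnitude for every interior pixel; border pixels are set to 0."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = edge_magnitude(image, r, c)
    return out
```

Each interior pixel costs ten additions or subtractions, four
doublings, and two absolute values, which for a 256 by 64
image is the "great deal of calculation" referred to above.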

As in the first program, the input image array is divided
into a target window and a background remainder, though
these windows need not be of the same size and/or shape as
those in Program 1. This is shown in Figure 9.


[Diagram label: TARGET WINDOW]

Figure 9. Input Array Target and Background Windows


Program 2 first processes the target window, and each 
pixel's edge magnitude is calculated. A two dimensional 
histogram is then developed, containing the number of pixels 
having each combination of intensity and edge magnitude, 
versus that combination of intensity and edge magnitude. 
After completing the target window, the program performs the 
same operation on the background pixels, generating a 
separate background histogram. 
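
A two dimensional histogram of this kind can be sketched in
Python as follows. The sketch assumes that the intensity and
edge magnitude of every pixel are already available as two
arrays of the same shape, and it stores the histogram as a
mapping keyed by the (intensity, edge magnitude) pair rather
than as a dense array. Both the function name joint_histogram
and the in_window callback are hypothetical; the thesis
listing may well organize the data differently.

```python
from collections import Counter


def joint_histogram(intensity, edge, in_window):
    """Count pixels per (intensity, edge magnitude) pair inside a window.

    `in_window(r, c)` returns True for pixels that belong to the window.
    """
    hist = Counter()
    for r, row in enumerate(intensity):
        for c, value in enumerate(row):
            if in_window(r, c):
                hist[(value, edge[r][c])] += 1
    return hist


# Usage: columns 1-2 form the target window, the rest is background.
intensity = [[5, 9, 9, 5],
             [5, 9, 7, 5]]
edge = [[0, 3, 3, 0],
        [0, 3, 1, 0]]
target = joint_histogram(intensity, edge, lambda r, c: 1 <= c <= 2)
background = joint_histogram(intensity, edge, lambda r, c: not 1 <= c <= 2)
```

A sparse mapping is used here only because most combinations
of intensity and edge magnitude never occur; a dense array
indexed by both attributes would serve equally well where
memory allows.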

The program then processes the target window pixels by
using a Bayesian probability method. For each pixel, the
probability of that pixel being a target pixel and of being
a background pixel is determined by the use of the target
and background histograms. If the target window and
background window areas are equal, the probabilities can be
read directly off the histograms. If the areas are not
equal, the histogram values must be appropriately scaled.
The program in this case used equal sized windows, avoiding
the need for any scaling.


For each target window pixel, the target and background
probabilities are determined from the corresponding
histogram values, and are inserted into the following
inequality [Ref. 5]:


     C(B:T)P(X=T) > C(T:B)P(X=B)


Where: C(B:T) is the cost of misclassifying a pixel as
              a background pixel, if it is a target.

       P(X=T) is the probability that the pixel is a
              target.

       C(T:B) is the cost of misclassifying a pixel as
              a target pixel, if it is a background.

       P(X=B) is the probability that the pixel is
              a background.

If this inequality is true, the pixel being checked is
set to one in the binary array. If the inequality is false,
the pixel is set to zero.
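
The decision rule can be sketched in Python as follows,
assuming the target and background histograms are mappings
from an (intensity, edge magnitude) pair to a pixel count.
The function name is_target and its parameters are
hypothetical. With the default areas of one, the raw
histogram counts are compared directly, which corresponds to
the equal sized windows used in this work; passing the two
window areas scales the counts for unequal windows.

```python
def is_target(key, target_hist, background_hist,
              cost_bt=1.0, cost_tb=1.0,
              target_area=1, background_area=1):
    """Apply C(B:T)P(X=T) > C(T:B)P(X=B) to one pixel.

    `key` is the pixel's (intensity, edge magnitude) pair. Combinations
    absent from a histogram are treated as having a count of zero.
    """
    p_target = target_hist.get(key, 0) / target_area
    p_background = background_hist.get(key, 0) / background_area
    return cost_bt * p_target > cost_tb * p_background


# Usage with small made-up histograms (counts are illustrative only).
target_hist = {(9, 3): 3, (7, 1): 1}
background_hist = {(9, 3): 1, (5, 0): 3}
binary_value = 1 if is_target((9, 3), target_hist, background_hist) else 0
```

Because the inequality is strict, a pixel whose combination
appears in neither histogram is classified as background.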

The two cost factors C(B:T) and C(T:B) are constants that
the user preselects. The most appropriate values will depend
upon the application and the input data being processed.
One likely choice is to set the two equal in value. If
this is done, the minimum number of pixels will be
misclassified, though there will still be some
misclassifications. In Figure 10, the same image was
processed, but the cost values were changed for each run, to
show the effects of varying these cost values.

In the algorithms used in this work, the analysis was
based on the two pixel attributes previously stated.
However, the same Bayesian probability system can be
modified to handle three or more attributes.

One of the advantages of Program 2 is that, if the cost
factors are properly selected, it generates less of the
random noise mentioned earlier than Program 1 does. This can
eliminate the need for the "REMOVE" procedure required by
Program 1.


Figure 10. Results of Varying Cost Factors in Bayesian
Probability (Program 2)


V. ANALYSIS OF THE TEST RESULTS


A. GENERAL 

The two image segmentation algorithms discussed were both 
run for the 1750A/F9450 and VAX systems respectively. The 
time of execution was actually measured for the VAX system, 
and calculated for the 1750A/F9450 microprocessor. The code 
size was determined for each, and all the results were 
reported in Figure 5. This table is repeated in Figure 11, 


for easy reference. 


                        Program 1          Program 2

Pascal Source Code      462 lines          382 lines

VAX Assembly Code       548 lines          597 lines

MPP Assembly Code       911 lines          1367 lines

VAX Execution Time      8.31-8.41 sec      14.37-14.9 sec

MPP Execution Time      8.78-8.91 sec      14.24-14.88 sec


Figure 11. Summary of Program Test Results


B. COMPARISON OF PROGRAMS 1 AND 2

The primary purpose of this study is to determine the
applicability of the 1750A/F9450 CPU to the implementation
of image segmentation algorithms. Based upon the MPP and VAX
execution times shown in Figure 11, the immediate answer
would seem to be that it is indeed applicable, if the VAX
itself is adequate. For both Programs 1 and 2, the execution
times on the two systems are virtually identical, differing
by only a fraction of a second. The fact that the two times
are almost identical in this case suggests not only that the
Microprocessor Pascal tool set allows the programmer to
develop 1750A/F9450 software on a VAX minicomputer, but also
that program execution times may be estimated by executing
the same programs in the VAX Pascal system, rather than
calculating them as was done in this work.

It should be noted here, however, that the execution
speeds are somewhat variable, as indicated by the range of
times in the table. Part of this is due to the variance in
input images, which will affect processing time. In the VAX
case, the times would also be affected somewhat by the
presence or absence of a Floating Point Accelerator. The
accelerator would not be expected to make a significant
difference in this particular work, because neither program
makes extensive use of floating point operations; instead
they use byte and integer values.


C. ANALYSIS OF RESULTS

Sheer execution speed is not the sole criterion for 
determining the value of a given hardware or software 
system. Other factors can include the support requirements 
of the hardware, the memory requirements of the software 
(such as array size), and any other specialized user 
needs. 

In this study, where the chip used was a version of a 
military microprocessor (the 1750A), a significant 
restriction is the memory requirement. This is the case 
because the microprocessor might be installed in an aircraft, 
missile, or other vehicle where space and weight are 
critical factors. This can limit the amount of physical 
memory circuitry that can be used, regardless of the amount 
of logical memory that the microprocessor can actually 
address. 

In image processing, large arrays are normally used to 
store the image data. One method of attempting to minimize 
the storage requirements of these arrays is to use "packed 
arrays" to store data. This can reduce array storage 
requirements by approximately one half. Packed arrays can 
have the unfortunate additional effect of increasing 
execution time, if the system is inefficient in dealing 
with packed data. The VAX has a variety of data types that 
allow efficient implementation of packed arrays. In 
particular, there is a Packed Decimal String data type in 
the VAX system. The 1750A/F9450, unfortunately, does not 
have such a data type. This requires the Microprocessor 
Pascal system to use procedures (parts of the run time 
support library) to pack and unpack the data. These 
procedures accounted for a significant portion of the 
execution time estimates for the 1750A/F9450, for both 
Programs 1 and 2. In Program 1, for instance, approximately 
3 of 8 seconds of execution time was spent by the 
1750A/F9450 system in packing and unpacking. 
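
The cost of packed data can be illustrated with a minimal sketch. It is written in Python rather than Microprocessor Pascal, and it assumes that two 8-bit pixels share one 16-bit word with the first pixel in the high byte. The function names are hypothetical; the actual run time support routines are not listed in this work, and their byte order may differ. 

```python
def pack_bytes(pixels):
    """Pack 8-bit values two per 16-bit word (first pixel in the high byte)."""
    if len(pixels) % 2:
        pixels = list(pixels) + [0]  # pad odd-length input with a zero byte
    return [((pixels[k] & 0xFF) << 8) | (pixels[k + 1] & 0xFF)
            for k in range(0, len(pixels), 2)]


def load_packed(words, index):
    """Read the byte at a 0-based byte index from the packed words."""
    word = words[index // 2]
    return (word >> 8) & 0xFF if index % 2 == 0 else word & 0xFF


def store_packed(words, index, value):
    """Write one byte at a 0-based byte index without disturbing its neighbor."""
    k = index // 2
    if index % 2 == 0:
        words[k] = (words[k] & 0x00FF) | ((value & 0xFF) << 8)
    else:
        words[k] = (words[k] & 0xFF00) | (value & 0xFF)
```

Every pixel access in this sketch costs a divide, a test, and a shift or mask in addition to the memory reference itself. This is the kind of overhead that a machine without hardware support for packed data must pay in software. 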

Another significant difference in the use of memory is 
the size of the program itself. This information is 
contained in Figure 11, in terms of the number of lines of 
assembly code for each program on each system. 

In this comparison, the VAX minicomputer has a 
significant advantage. As shown in Figure 11, the first 
program had 462 lines of Pascal source code. The VAX system 
translated this into 548 lines of assembly code, and the 
1750A/F9450 required 911 lines of assembly code to do the 
same thing. This shows that the VAX compiler needed only a 
1.19:1 ratio in memory expansion to accommodate the compiled 
code, while the microprocessor needed a 1.97:1 ratio. For 
the second program, the ratios were 1.56:1 and 3.58:1. 
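
The expansion ratios quoted above follow directly from the line counts in Figure 11. The short Python sketch below reproduces the arithmetic. The dictionary layout and the function name are illustrative only. 

```python
# Line counts taken from Figure 11.
LINES = {
    "Program 1": {"pascal": 462, "vax": 548, "mpp": 911},
    "Program 2": {"pascal": 382, "vax": 597, "mpp": 1367},
}


def expansion_ratios(counts):
    """Return (VAX, MPP) assembly lines per Pascal source line, to 2 places."""
    return (round(counts["vax"] / counts["pascal"], 2),
            round(counts["mpp"] / counts["pascal"], 2))


for name, counts in LINES.items():
    vax, mpp = expansion_ratios(counts)
    relative = counts["mpp"] / counts["vax"]
    print(f"{name}: VAX {vax}:1, MPP {mpp}:1, MPP/VAX {relative:.2f}")
```

The last figure printed for each program is the size of the 1750A/F9450 assembly code relative to the VAX assembly code for the same source. 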


D. COMPARISON OF VAX AND 1750A/F9450 SYSTEMS 

While the two programs produced significantly different 
ratios between the two systems, the VAX system is 
consistently on the order of twice as efficient in code size 
as the 1750A/F9450 microprocessor tool set. This is a 
significant difference, especially considering the nearly 
identical execution times. The difference in assembly code 
size is obviously a matter of concern, and it may be 
possible to improve the situation if the cause can be found. 

One obvious possibility is the efficiency of the compiler 
in each system. The VAX system is a commercially available 
system, and is relatively mature, having gone through the 
normal revisions as required over a number of years. The 
1750A/F9450 tool set is the first version of a 
microprocessor system, intended largely for military use. 
Most likely it will be improved in later versions, but this 
doesn't solve the immediate problem. 

This situation may be improved somewhat, by two methods. 
Firstly, a skilled programmer can take greater care in 
writing the Pascal version of the program, making it more 
efficient. It may be possible, for example, to replace a 
long sequential portion of code, by a shorter loop, which 
may require less assembly code to implement. Other methods 
of improvement are those stated earlier, such as improved 
parameter passing, and the use of IF-THEN-ELSE instead of 
the CASE statement. Secondly, the Reverse Assembler and 
Assembler can be used to optimize the assembly code itself. 
This manually optimized code can then be incorporated into 
the desired program. 

Another significant advantage of the VAX system is the 
larger instruction set and greater variety of addressing 
modes it has, compared to the 1750A/F9450. One very useful 
addressing mode, shared by the two systems, is known as the 
Index Addressing Mode in the VAX system, and the Base 
Relative Mode in the 1750A/F9450. In each case, this mode 
allows the use of an index register to specify the index of 
an array entry, thus specifying which element of the array 
is being addressed. This is shown pictorially in Figure 12 
(Ref. 6). 

[Figure 12 diagram: an array labeled TABLE, with an index 
selecting among ELEMENT 2, ELEMENT 3, and ELEMENT 4. 
NOTE: SYSTEM MUST KEEP TRACK OF ELEMENT SIZE] 


Figure 12. Index Mode/Base Relative Addressing 


This mode is particularly useful in array intensive 
programs, and both of the programs used in this work make 
frequent use of data arrays. Unfortunately, the 1750A/F9450 
Base Relative mode allows only an offset of 256 from the 
base address, which limits its usefulness in this work. The 
smallest array used in either program contains 16k elements, 
which is well beyond the capability of the 256 offset of the 
1750A. 

The powerful VAX system allows eight, sixteen, or 
thirty-two bit offset values, allowing a potential four 
gigabyte offset, and can thus easily handle our 16k arrays. 

This gives the VAX a significant advantage over the 
microprocessor. The VAX can handle the array offsets in 
hardware, while the 1750A/F9450 must do it in software, with 
the compiler generating a variable to perform this function. 
This is one factor which can account for the larger 
microprocessor assembly code. 

As an illustration of how significant this type of index 
addressing can be, an example is presented. In Figure 13, a 
small sample of Pascal is listed, along with the VAX and 
1750A assembly code translations of it. The difference in 
size is obvious, and the reasons for the VAX code being 
significantly smaller will now be explored. 

It is not necessary to have a complete knowledge of 
assembly code for either system to see that there are 
significant differences in the manner in which the two 
systems translate the code. One immediate advantage of the 
VAX system is the fact that even at the assembly code 
level, the system uses the same identifiers as the Pascal 
source code. This is shown in statements such as "MOVL 
INFILE,R3". The 1750A/F9450 assembly code, on the other 
hand, uses only register numbers to perform the same 
function. 


Pascal Source Code: 

     FOR I := 1 TO 64 DO 
     BEGIN 
        FOR J := 1 TO 256 DO 
           IMAGE[I,J] := INFILE[J]; 
     END; 

     VAX-11/780                              1750A/F9450 

1$:  MOVL   #1,R12                                 L     R13,LEX$1,R9 
     NOP                                           STC   1,0000,R14 
     NOP                                     L0004 EQU   $ 
2$:  MOVL   R12,I                                  LIM   R2,00040 
     MOVL   #1,R0                                  CB    R14,00000 
     NOP                                           BLT   L002D 
     NOP                                           STC   [illegible] 
3$:  MOVL   R0,J                             L000A EQU   $ 
     INDEX  J,#1,#256,#1,#0,R1                     LIM   R2,00100 
     INDEX  I,#1,#64,#256,#0,R2                    CB    R14,00001 
     ADDL2  R1,R2                                  BLT   L002A 
     MOVL   INFILE,R3                              LIM   R12,00005,R13 
     MOVB   -1(R3)[R1],IMAGE-257[R2]               LB    R14,00000 
     AOBLEQ #256,R0,3$                             SISP  R2,1 
     CMPL   I,#64                                  SLL   R2,7 
     BGEQ   5$                                     AR    R12,R2 
     PUSHAB INFILE                                 PSHM  R12,R12 
     CALLS  #1,PAS$GET                             LB    R14,00001 
5$:  AOBLEQ #64,R12,2$                             PSHM  R2,R2 
     RET                                           [illegible] R2,04105,R13 
                                                   LR    R12,R2 
                                                   LB    R12,00000 
                                                   PSHM  R2,R2 
                                                   LB    R14,00000 
                                                   SISP  R2,1 
                                                   SLL   R2,8 
                                                   AB    R14,00001 
                                                   SISP  R2,1 
                                                   POPM  R3,R3 
                                                   CALL  LDPI$8 
                                                   POPM  R4,R4 
                                                   SISP  R4,1 
                                                   POPM  R3,R3 
                                                   CALL  STPI$8 
                                                   INCM  1,00001,R14 
                                                   BR    L000A 
                                             L002A EQU   $ 
                                                   INCM  1,00000,R14 
                                                   BR    L0004 
                                             L002D EQU   $ 
                                                   END 


Figure 13. Comparative Assembly Code Translations 


Upon study of Figure 13, it is possible to find some of 
the reasons for the shorter VAX code. 

In the VAX code, three lines allow the use of the 
powerful VAX Index Addressing Mode. The two lines starting 
with "INDEX" generate in R1 and R2 the positions of the 
desired data element, based upon an input index (I or J), an 
offset value (0 here), and the data element size in bytes (1 
for the first INDEX, and 256, the length of one image row, 
for the second). More succinctly, for the first INDEX, 
R1 = (0+J)*1, and for the second, R2 = (0+I)*256. These two 
values are added, and the sum is used as the index to 
address the image array. The line that uses the indices is 
MOVB -1(R3)[R1],IMAGE-257[R2]. This line instructs the 
system to move a byte, offset from the first element of 
INFILE (held in R3) by the number of bytes held in R1, into 
the position specified by IMAGE-257[R2]. 
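
The index arithmetic just described can be checked with a simplified Python model of the INDEX instruction. The model keeps only the subscript range check and the scaling, (index_in + subscript) * size. The function names are hypothetical, and condition codes and trap details are omitted. 

```python
def vax_index(subscript, low, high, size, index_in):
    """Simplified model of INDEX: range-check the subscript, then scale it."""
    if not low <= subscript <= high:
        raise IndexError(f"subscript {subscript} outside {low}..{high}")
    return (index_in + subscript) * size


def image_byte_offset(i, j):
    """Byte offset of IMAGE[i, j] from the start of IMAGE (1-based subscripts)."""
    r1 = vax_index(j, 1, 256, 1, 0)    # INDEX J,#1,#256,#1,#0,R1
    r2 = vax_index(i, 1, 64, 256, 0)   # INDEX I,#1,#64,#256,#0,R2
    r2 = r1 + r2                       # ADDL2 R1,R2
    return r2 - 257                    # operand IMAGE-257[R2]
```

The constant 257 in the operand compensates for the one-based Pascal subscripts: I*256 + J - 257 equals (I-1)*256 + (J-1). In this model the first element IMAGE[1,1] therefore lies at offset 0 and the last element IMAGE[64,256] at offset 16383. 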

The VAX code of course, uses nested loops, as indicated, 
to execute this sequence 16k times. To do this, it makes use 
of AOBLEQ ("Add one and branch if less than or equal") 
statements. The actual command to "get" the infile, is the 
CALLS #1, PAS$GET which makes use of a system call. 

The code generated by the 1750A/F9450 system is neither 
short nor easy to understand, as it makes use of a more 
primitive set of assembly instructions. As indicated by 
Figure 13, almost as much code is devoted to keeping 
track of the nested loop indices as the VAX uses for the 
entire operation. The loop counters are maintained in two 
locations in memory, R14 + 0000 and R14 + 0001 respectively. 
These are the locations determined by the contents of R14, 
offset by zero and one byte. The portion of code within 
brackets is the code concerned with actually reading the 
infile data into the image. 

A study of Figure 13 will show that the 1750A must use 
three separate pushes onto the stack (PSHM's) and three 
separate pops (POPM's) to produce the addresses necessary 
to identify the desired infile and image bytes to read and 
write. This is because the 1750A, as stated earlier, can 
only offset a maximum of 256 from a specified starting 
point. To overcome this, the code must "manually" generate 
the desired indices, by reading the aforementioned R14 + 0000, 
shifting the high order bytes left (the "SLL" commands), 
and manually adding terms to produce the needed values. 
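
The "manual" index generation can be sketched in Python as a subtract, a shift, and an add. This is a paraphrase of the idea, not a translation of the generated code. The function names, and the reading of the limit as "index below 256", are assumptions made for illustration. 

```python
ROW_LENGTH = 256   # pixels per image row, from the FOR J loop
ROW_SHIFT = 8      # 2**8 == 256, so a left shift replaces the multiply


def element_index(i, j):
    """0-based position of IMAGE[i, j] using subtract, shift, and add."""
    row = i - 1                    # comparable to SISP R2,1 (subtract one)
    row_start = row << ROW_SHIFT   # comparable to SLL R2,8 (shift left eight)
    return row_start + (j - 1)     # add the zero-based column


def fits_base_relative(index, limit=256):
    """True if the index could be expressed as a single offset below limit."""
    return 0 <= index < limit
```

Under these assumptions only the first row of the 64 by 256 image is reachable with a single offset. Every later row requires the index to be built in software before the pixel can be addressed. 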

The "CALL LDPI$8" and "CALL STPI$8" lines are the system 
calls required to allow the 1750A system to read bytes from 
a packed array ("INFILE") and write bytes to another packed 
array ("IMAGE"). 

In general then, it can be seen that the capability to 
operate directly on larger array indices would 
significantly improve code size in the 1750A/F9450 system, 
and could also improve processing time. This would be even 
more significant for systems using larger arrays than are 
used here. 

One useful addressing mode possessed by the VAX system, 
but not by the 1750A/F9450, is the Auto Increment/Decrement 
mode. In this mode, the system automatically increments or 
decrements the loop index, as required. This is particularly 
useful in the programs used here, as both algorithms use 
large numbers of loops, and nested loops in particular. 
Because of this, any technique such as Auto 
Increment/Decrement is bound to improve the speed with 
which either Program 1 or 2 will execute. Unfortunately, 
unlike code optimization, new addressing modes cannot be 
readily implemented in an existing system such as the 
1750A/F9450 microprocessor. Thus, this particular 
shortcoming cannot be easily remedied. The addition of a 
Memory Management Unit, such as the aforementioned F9451, 
could impair memory access times, and thus degrade the 
situation further. 


E. SUMMARY 

In summary, the 1750A/F9450 would appear comparable 
overall to the VAX minicomputer in image segmentation speed, 
but not in the amount of memory needed to implement such 
algorithms. 

Assuming the memory requirements of the 1750A/F9450 
microprocessor were not objectionable for a given 
installation, the next decision would be to determine what 
the maximum allowable time for processing an image could be. 
This would of course be dependent upon the application being 
used, so it is not possible to give a hard and fast answer 
as to the applicability of the 1750A/F9450. Some general 
guidelines may be given, however. 

If Figure 11 is reviewed, it can be seen that, using the 
tested algorithms, the best processing time would be with 
the first program, and that approximately 8.78 seconds is 
required. If the code were highly optimized at both the 
Pascal and assembly code levels, it is reasonable to expect 
perhaps a 10% improvement in this. This would result in 
approximately a 7.9 second conversion time. 

If the image being processed were "off-line", such as a 
medical X-ray, or certain industrial quality control 
applications, the wait of eight seconds might not be 
objectionable. This might also be true for some military 
applications such as a long range sonar, where the signal 
itself may take something on the order of seconds to reach a 
target and return. 

Many applications however, such as a missile sensor or a 
pilot's "heads up display" require a much faster processing 
of data. It would not be reasonable for a pilot to expect 
his sensors to take eight seconds to update, as a target 
might very well move out of range in that time. 

If it is necessary to attempt to use the 1750A/F9450 in a 
role such as real time image segmentation, some way must be 
found to speed up the processing. 


VI. CONCLUSIONS AND RECOMMENDATIONS 


A. PROBLEMS ENCOUNTERED WITH TOOL SET 

The software tool set used in this work is a powerful 
system. Like any system, however, it is not perfect, and some 
difficulties were encountered. 


One problem surfaced when we attempted to compile program 
1. The compiler, as would be expected, has a number of 
default settings which control the compilation unless 
altered by the user. While these default settings caused no 
true problems, the user must be aware of these settings 
(Ref. 7). First, the system defaults to a 72 column maximum 
setting. This can cause numerous error messages if a program 
is transported from a system which uses a standard 80 column 
line, until the compiler default is changed. 

Another default which could cause some problems unless 
changed, is the fact that the tool set compiler does not 
routinely check array indices for out of bound conditions 
unless this feature is specifically activated. This is a 
helpful feature for such array intensive programs as image 
segmentation, and the user should be aware that this feature 
is normally off. 

More significantly, the Microprocessor Pascal tool set 
deals somewhat differently with certain standard Pascal 
procedures than might be expected (Ref. 8). It was 
discovered, for example, that to open a disk file, such as 
the image files used, one needed to use not the expected 
"OPEN" procedure, but instead either "RESET" or "REWRITE" 
alone. It was discovered that these procedures both open 
and reset files for read and write operations. It was also 
discovered that the procedure "CLOSE" is an external 
procedure, and must be declared as such. 

The next difficulty occurred when program 1 had been 
successfully compiled. When it was attempted to link the 
program, numerous error messages were generated, indicating 
that the system was unable to locate a series of procedures 
required by the main program. These procedures, bearing such 
names as F$GET and L$RD, were not user created, and it was 
found that they were supposed to be part of the system's Run 
Time Support library. The library was checked, and they were 
indeed not included. 

At first it was feared that the missing procedures had 
somehow been accidentally erased or destroyed. Upon further 
study however, it appeared that all of these procedures were 
involved with the input or output of program data. This 
appeared to be the case, since the names could be mnemonics 
for such operations as "file get" and "line read". 

After contacting development personnel at Texas 
Instruments, it was determined that the procedures were 
intentionally missing. The 1750A/F9450 microprocessor was 
intended for use in a wide variety of applications, 
and thus would need to interface with a wide variety of 
peripheral equipment: disk drives, terminals, and even real 
time systems such as sensors. Because of this, it was 
necessary to keep the 1750A/F9450 as device independent as 
possible. To do this, the input/output routines were not 
implemented (though the names such as F$GET were). This 
would allow (in fact require) the user to develop the 
routines necessary to perform input/output operations with 
the user's particular equipment. 

It was the lack of input/output capabilities in the tool 
set, as well as the lack of a 1750A/F9450 hardware 
development system, that dictated the need to develop a 
means of determining 1750A/F9450 execution speeds 
indirectly. Even if these routines were in place, however, 
the speed with which a Microprocessor Pascal program ran on 
a VAX minicomputer would not be expected to be the same as 
on a 1750A/F9450 microprocessor. 


B. METHODS OF IMPROVING EXECUTION SPEED 

As described previously, the 1750A/F9450 was found to be 
too slow in execution speed for real time applications. 
Therefore, if it is still necessary to use an F9450 or 1750A 
microprocessor in real time image processing, it will be 
necessary to find some method of increasing either its speed 
or the system's actual throughput. 

If, in a given application, memory limits are not a 
problem, a significant improvement could be made in 
execution time by using "unpacked" arrays instead of 
"packed" arrays. The data arrays used in the tested programs 
were 16 kilobytes in packed size. These would approximately 
double in size if unpacked. If the 1750A/F9450 in a given 
installation could use multiple megabyte sized memory, it 
would be feasible to use such unpacked arrays, and thus 
speed up processing significantly. In program 1, for example, 
the execution time would go from approximately 8 to 
approximately 5 seconds, based upon the execution time 
estimates, due to the elimination of packing/unpacking 
times. 

Another option to speed up processing is to make use of 
multiple processors. This could be done in two possible 
ways: operate the processors in parallel, or operate them in 
series. Each of these choices offers a different method of 
improving the processing time. 

In studying the operation of the two programs (as listed 
in Appendices A and B), it becomes apparent that there are 
two main operations involved: generating histograms of the 
background and target attributes of the pixels, and 
generating the binary output arrays based on these 
histograms. In some cases, it may be possible to perform 
these operations with two different processors. If the 
processors are working on the same operation, they are said 
to be working in parallel. If the processors are processing 
different operations, they are working in series, which is 
sometimes also referred to as "pipelining". 

In parallel operation, as shown in Figure 14, for program 
2, one processor might be generating the target window 
histogram, while the other processor generates the 
background window histogram. As the two windows are often of 
the same size, each would take almost exactly the same 
amount of time, and thus the total histogram generation time 
would be divided by a factor of two. Following histogram 
generation, the two processors might also process the binary 
image in parallel, by perhaps working on different portions 
of the image at the same time. 
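
The division of histogram work between two processors can be sketched as follows. Python threads stand in for the two microprocessors, so the sketch shows only how the work is split and does not demonstrate a hardware speedup. The function names and the 256 gray levels are assumptions for illustration. 

```python
from concurrent.futures import ThreadPoolExecutor


def histogram(window, levels=256):
    """Count how many pixels in a 2-D window take each gray level."""
    counts = [0] * levels
    for row in window:
        for pixel in row:
            counts[pixel] += 1
    return counts


def histograms_in_parallel(target_window, background_window):
    """Build both window histograms concurrently, one worker per window."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        target_job = pool.submit(histogram, target_window)
        background_job = pool.submit(histogram, background_window)
        return target_job.result(), background_job.result()
```

Each worker reads only its own window and writes only its own histogram, so in this arrangement the two workers never write to the same address. 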

One possible problem with this method is the difficulty 
of having multiple processors addressing the same memory 
simultaneously. If not carefully coordinated, the two 
processors might attempt to read or write to the same 
address at the same time. Fortunately, the 1750A/F9450 
microprocessor and Microprocessor Pascal tool set are quite 
well equipped to work in this fashion. In particular, the 
multitasking capabilities of this Pascal version and of the 
1750A/F9450 itself can greatly simplify the coordination of 
multiple tasks. Additionally, the Memory Management Unit and 
Block Protect Unit in the 1750A/F9450 chip set can greatly 
simplify the problem of preventing memory contention. 

Another method of preventing memory contention would be 
the use of multi port memory. This relatively new technology 
allows multiple processors to access the same memory 
simultaneously. Of course, the availability of this 
technology is not known for all the various applications of 
the 1750A/F9450. 

In the series, or pipelining, case, the task of processing 
the data is also divided between the processors. However, as 
shown in Figure 15, each processor would perform only one of 
the functions, either histogram generation or generating the 
binary image. The first processor would histogram the input 
image and transfer the histograms to the second processor. 
The second processor would then use the histograms to 
generate a binary output image. After each processor is 
finished, it reads the next input image to perform the same 
operations. 
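
The series arrangement can be sketched as two stages joined by a hand-off queue. As before, Python threads stand in for the two processors. The thresholding rule used in the second stage (a pixel is set if it is brighter than the mean gray level computed from the histogram) is only a placeholder assumption. It is not the decision rule of Program 1 or Program 2, and all names are hypothetical. Images are assumed to be non-empty. 

```python
import queue
import threading

DONE = None  # sentinel marking the end of the image stream


def histogram_stage(images, handoff):
    """Stage 1: histogram each image, then hand image and histogram on."""
    for image in images:
        counts = [0] * 256
        for row in image:
            for pixel in row:
                counts[pixel] += 1
        handoff.put((image, counts))
    handoff.put(DONE)


def binary_stage(handoff, results):
    """Stage 2: threshold each image at its mean gray level (assumed rule)."""
    while True:
        item = handoff.get()
        if item is DONE:
            break
        image, counts = item
        total = sum(counts)
        mean = sum(level * n for level, n in enumerate(counts)) / total
        results.append([[1 if pixel > mean else 0 for pixel in row]
                        for row in image])


def run_pipeline(images):
    """Run both stages concurrently, connected by a one-slot queue."""
    handoff = queue.Queue(maxsize=1)
    results = []
    first = threading.Thread(target=histogram_stage, args=(images, handoff))
    second = threading.Thread(target=binary_stage, args=(handoff, results))
    first.start()
    second.start()
    first.join()
    second.join()
    return results
```

The one-slot queue plays the role of the transfer between processors: the first stage can begin the next image while the second stage is still producing the binary output for the previous one. 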

The pipelining method is somewhat simpler to coordinate 
than the parallel case, as there is no problem of two 
processors attempting to access the same data address 
simultaneously. It is only necessary to use an interrupt 
system for each processor to alert the other when it is 
ready to transfer data from one to the other. This may not 
speed up the process as much as the parallel case, as the 
histogram generation may not take the same period of time as 
the binary generation, so that one processor may sit idle 
waiting for the other to finish. However, even if the 
processing of a single image is not as fast as the parallel 
method, the series method will normally result in a greater 
total throughput of images. This may be especially useful if 
the system is continuously processing images, as in the case 
of a cockpit display, for instance. 

This pipelining might also be a case where the Built In 
Function instruction of the 1750A/F9450 microprocessors 
might be put to use. One processor might "call" the other to 
generate histograms, and then use them to create output 
arrays. This would be easier to implement than an elaborate 
handshaking scheme. 

In summary of these two methods, the parallel method will 
tend to generate a single image more quickly, but the series 
method will tend to produce a greater total throughput of 
images. This seems to recommend the parallel method for 
individual images, and the series scheme for continuously 
updated image systems. 

For maximum improvement, some of these methods could be 
combined. The same system could make use of improved 
algorithms, unpacked arrays, and either parallel processing 
or pipelining. A combination of methods might well reduce 
the total time for image segmentation to something on the 
order of one or two seconds. This might well be fast enough 
for use in some real time systems. 


C. SUGGESTIONS FOR FURTHER WORK 

Further work remains to be done in several areas. One 
such area would be to write and implement the necessary 
input/output procedures needed for the Microprocessor Pascal 
tool set, for the VAX-11/780 minicomputer system. This would 
allow much more efficient work with the tool set than the 
indirect speed estimates which were done here. 

Additional work would also be useful to determine how 
much improvement might be gained by use of the pipelining 
and/or parallel schemes described. It might be possible to 
develop a means of determining exactly when pipelining or 
parallel processing would be preferable. 

Finally, it would be useful to develop an actual 
1750A/F9450 hardware system, to allow further work on 
software development. If such a system becomes available, it 
would be possible to test the accuracy of the timing 
calculations done in this thesis. 


Figure 14. Parallel Processing Scheme for Multiprocessing 

[Diagram labels: INPUT IMAGE ARRAY; BACKGROUND; OUTPUT BINARY 
ARRAY. NOTE: THESE MICROPROCESSORS OPERATE SIMULTANEOUSLY] 


Figure 15. Series Processing Scheme for Multiprocessing 
(Pipelining) 